Goto

Collaborating Authors

 Representation Of Examples









Metric space valued Fréchet regression

Györfi, László, Humbert, Pierre, Bars, Batiste Le

arXiv.org Machine Learning

We consider the problem of estimating the Fréchet and conditional Fréchet mean from data taking values in separable metric spaces. Unlike Euclidean spaces, where well-established methods are available, there is no practical estimator that works universally for all metric spaces. Therefore, we introduce a computable estimator for the Fréchet mean based on random quantization techniques and establish its universal consistency across any separable metric spaces. Additionally, we propose another estimator for the conditional Fréchet mean, leveraging data-driven partitioning and quantization, and demonstrate its universal consistency when the output space is any Banach space.


Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

Neural Information Processing Systems

Learning useful representations is a key ingredient to the success of modern machine learning. Currently, representation learning mostly relies on embedding data into Euclidean space. However, recent work has shown that data in some domains is better modeled by non-euclidean metric spaces, and inappropriate geometry can result in inferior performance. In this paper, we aim to eliminate the inductive bias imposed by the embedding space geometry. Namely, we propose to map data into more general non-vector metric spaces: a weighted graph with a shortest path distance. By design, such graphs can model arbitrary geometry with a proper configuration of edges and weights. Our main contribution is PRODIGE: a method that learns a weighted graph representation of data end-to-end by gradient descent. Greater generality and fewer model assumptions make PRODIGE more powerful than existing embedding-based approaches. We confirm the superiority of our method via extensive experiments on a wide range of tasks, including classification, compression, and collaborative filtering.


Chromatic Feature Vectors for 2-Trees: Exact Formulas for Partition Enumeration with Network Applications

Allagan, J., Morgan, G., Langley, S., Lopez-Bonilla, R., Deriglazov, V.

arXiv.org Artificial Intelligence

We establish closed-form enumeration formulas for chromatic feature vectors of 2-trees under the bichromatic triangle constraint. These efficiently computable structural features derive from constrained graph colorings where each triangle uses exactly two colors, forbidding monochromatic and rainbow triangles, a constraint arising in distributed systems where components avoid complete concentration or isolation. For theta graphs Theta_n, we prove r_k(Theta_n) = S(n-2, k-1) for k >= 3 (Stirling numbers of the second kind) and r_2(Theta_n) = 2^(n-2) + 1, computable in O(n) time. For fan graphs Phi_n, we establish r_2(Phi_n) = F_{n+1} (Fibonacci numbers) and derive explicit formulas r_k(Phi_n) = sum_{t=k-1}^{n-1} a_{n-1,t} * S(t, k-1) with efficiently computable binomial coefficients, achieving O(n^2) computation per component. Unlike classical chromatic polynomials, which assign identical features to all n-vertex 2-trees, bichromatic constraints provide informative structural features. While not complete graph invariants, these features capture meaningful structural properties through connections to Fibonacci polynomials, Bell numbers, and independent set enumeration. Applications include Byzantine fault tolerance in hierarchical networks, VM allocation in cloud computing, and secret-sharing protocols in distributed cryptography.